Speaker verification based on broad phonetic categories
Authors
Abstract
In this work we present a speaker verification system based on four broad phonetic categories: vowels+diphthongs, fricatives, glides+nasals, and silence+stops. When these categories are used separately, vowels, diphthongs, and fricatives prove to be the most important for speaker verification. This observation confirms the results from the analysis of speaker and channel variability in speech. Using NIST speaker verification evaluation data, the performance of the phone-based system is compared with that of a conventional speaker verification system based on a Gaussian mixture model (GMM). The results show that the phone-based system outperforms the conventional system, particularly when there is channel mismatch between training and testing data.
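To make the idea of category-wise scoring concrete, here is a minimal sketch, not the authors' implementation. It assumes frames already carry phone labels, that each broad category has its own diagonal-covariance GMM for the claimed speaker and for a background population, and that the score is an average log-likelihood ratio. The phone inventory, the assignment of liquids to the glides+nasals group, the background model, and every function name are illustrative assumptions that do not come from the abstract.

```python
import math
from collections import defaultdict

# Hypothetical ARPAbet-style inventory. The abstract names only the four broad
# categories; which phone labels fall into each one is an assumption here.
BROAD_CATEGORIES = {
    "vowels+diphthongs": {"aa", "ae", "ah", "ao", "eh", "er", "ih", "iy",
                          "uh", "uw", "aw", "ay", "ey", "ow", "oy"},
    "fricatives": {"f", "v", "th", "dh", "s", "z", "sh", "zh", "hh"},
    "glides+nasals": {"l", "r", "w", "y", "m", "n", "ng"},
    "silence+stops": {"sil", "p", "b", "t", "d", "k", "g"},
}
PHONE_TO_CATEGORY = {
    phone: category
    for category, phones in BROAD_CATEGORIES.items()
    for phone in phones
}


def gmm_log_likelihood(frame, gmm):
    """Log-likelihood of one feature vector under a diagonal-covariance GMM.

    `gmm` is a list of (weight, means, variances) tuples, one per component.
    """
    log_terms = []
    for weight, means, variances in gmm:
        log_p = math.log(weight)
        for x, mu, var in zip(frame, means, variances):
            log_p += -0.5 * (math.log(2.0 * math.pi * var) + (x - mu) ** 2 / var)
        log_terms.append(log_p)
    # Log-sum-exp over components for numerical stability.
    peak = max(log_terms)
    return peak + math.log(sum(math.exp(t - peak) for t in log_terms))


def category_scores(frames, phone_labels, speaker_gmms, background_gmms):
    """Average log-likelihood ratio per broad category (assumed scoring rule).

    Frames whose phone label is unknown, or whose category has no model,
    are skipped.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for frame, phone in zip(frames, phone_labels):
        category = PHONE_TO_CATEGORY.get(phone)
        if category is None or category not in speaker_gmms:
            continue
        llr = (gmm_log_likelihood(frame, speaker_gmms[category])
               - gmm_log_likelihood(frame, background_gmms[category]))
        totals[category] += llr
        counts[category] += 1
    return {category: totals[category] / counts[category] for category in totals}
```

Keeping one score per category makes it possible to inspect each category's contribution separately, which mirrors the per-category comparison the abstract reports. How the scores would be combined into a single accept/reject decision is not stated in the abstract and is left open here.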
Similar resources
Speaker Recognition and Broad Phonetic Groups
The aim of this study is to provide a quantitative assessment of the speaker-discriminating properties of broad phonetic groups. A GMM-based approach to speaker modelling is used in conjunction with a phonetically hand-labelled speech database (TIMIT) to produce a ranking of broad phonetic groups based on speaker identification scores. The broad phonetic groups nasals and vowels were found to be particu...
Mixture of Auto-Associative Neural Networks for Speaker Verification
The paper introduces a mixture of auto-associative neural networks for speaker verification. A new objective function based on posterior probabilities of phoneme classes is used for training the mixture. This objective function allows each component of the mixture to model part of the acoustic space corresponding to a broad phonetic class. This paper also proposes how factor analysis can be app...
Investigation of Frame Alignments for GMM-based Text-prompted Speaker Verification
Frame alignment plays an important role in GMM-based speaker verification. In text-prompted speaker verification, it is common practice to use the transcriptions to align speech frames to phonetic units. In this paper, we compare the performance of alignments from a hidden Markov model (HMM) and a deep neural network (DNN), using the same training data and phonetic units. We incorporate a pho...
Speaker verification with limited enrollment data
New methods for speaker verification that address the problems of limited training data and unknown telephone channel are presented. We describe a system for studying the feasibility of telephone based voice signatures for electronic documents that uses speaker verification with a fixed test phrase but very limited data for training speaker models. We examine three methods for speaker verification t...
Gaussian mixture modelling of broad phonetic and syllabic events for text-independent speaker verification
This paper examines the usefulness of a multilingual broad syllable-based framework for text-independent speaker verification. Syllabic segmentation is used in order to obtain a convenient unit for constrained and more detailed model generation. Gaussian mixture models are chosen as a suitable modelling paradigm for initial testing of the framework. Promising results are presented for the NIST ...